In this paper we explore how students construe what it means for an informal argument to be the 
basis of a formal proof and what students pay attention to when assessing whether a proof is based 
on an informal argument. The data point to some undergraduate mathematics students having 
underdeveloped conceptions of what it means for a proof to be based on an argument. These 
underdeveloped conceptions limit what students pay attention to during informal-to-formal 
comparison tasks and may have adverse effects on students’ ability to use their own informal 
arguments to construct proofs. 
Introduction 

The constructs we use throughout this paper revolve around the observation that proofs are 
expected to be written in a verbal-symbolic representation system but may not be generated wholly 
within that system. Following Weber and Alcock (2009) we refer to reasoning that stays solely 
within this system as syntactic reasoning, and reasoning that falls outside of it as semantic reasoning. 
Similarly, we conceptualize a formal proof as a deductive argument that establishes the result to be 
proven and conforms to the norms of the representation system of proof. This is a characterization of 
the end product, not the reasoning that led to it. Additionally, we conceptualize an informal argument 
as a deductive argument that establishes the result to be proven, but does not conform to the norms of 
the representation system of proof. We refer to the use of informal arguments to inform the 
construction of formal proofs, or more generally the process of using semantic reasoning to inform 
syntactic reasoning, as formalization. 

Research relevant to formalization can be partitioned into two non-disjoint categories. The first 
category focuses on the role of semantic reasoning in informing proof productions. The second 
category examines the semantic-to-syntactic formalization process needed to use semantic reasoning 
to generate a formal proof. 

Research falling into the first category has illustrated the important role that semantic reasoning 
can play in proof generation. Various types of semantic reasoning have been shown to inform proof 
generation (Gibson, 1998; Sandefur, Mason, Stylianides, & Watson, 2013; Zazkis, Weber, & Mejia-Ramos, 
in press). This first body of work has perpetuated the recommendation that students should 
use semantic reasoning during proof construction. This recommendation has gained considerable 
traction in mathematics education (e.g., Garuti, Boero, & Lamut, 1998; Raman, 2003). 

A second set of studies has focused on the formalization process itself. This research has 
provided evidence that mathematics majors struggle to use semantic reasoning to inform proof 
generation (Selden & Selden, 1995; Alcock & Weber, 2010; Zazkis et al., in press). 

These two sets of studies point to a discrepancy between researcher recommendations (that 
students should generate proofs using informal arguments) and students’ behavior and abilities. In 
order to better understand this discrepancy we examine what mathematics majors pay attention to 
when attempting to determine if a formal proof is based on an informal argument and how these 
determinations compare to normatively correct interpretations. 

In order to operationalize these notions we consider a formal proof to be based on an informal 
argument if there is a mapping between these two chains of inferences that has two properties: (1) it 
is meaning preserving to the extent allowed for by the rules of the verbal-symbolic system, and (2) 
corresponding inferences (or chains of inferences) appear in the same order. We use the acronym 
FBI-judgment to refer to student judgments of whether Formal proofs are Based on Informal 
arguments. 
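
Property (2) of this definition can be illustrated with a small sketch. The code below is only an illustration under our own assumptions: the function name, the representation of a mapping as pairs of step positions, and the sample correspondences are hypothetical and were not part of the study. It checks order preservation only; property (1), meaning preservation, is a matter of mathematical judgment and is not modeled.

```python
from itertools import combinations


def is_order_preserving(pairs):
    """Return True if corresponding steps appear in the same order.

    `pairs` is a collection of (informal_step, formal_step) positions,
    each counted from the start of its own chain of inferences.
    One informal step may correspond to several formal steps.
    """
    for (i1, f1), (i2, f2) in combinations(sorted(pairs), 2):
        # After sorting, i1 <= i2. A strictly later informal step must
        # not correspond to a strictly earlier formal step.
        if i1 < i2 and f1 > f2:
            return False
    return True


# Hypothetical correspondences (assumed for illustration only).
aligned = [(1, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 5)]
shuffled = [(1, 3), (2, 1), (3, 2)]

print(is_order_preserving(aligned))   # True
print(is_order_preserving(shuffled))  # False
```

Allowing one informal step to correspond to several formal steps reflects the "chains of inferences" clause in the definition: a single informal inference may unfold into more than one line of a formal proof.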


The Study 

We are interested in what it means for a proof to be based on an informal argument from a 
mathematics major’s perspective. Thus we create a model of what mathematics majors pay attention 
to when making FBI-judgments and use this model to create a plausible explanation for why these 
students may have difficulty with formalizing informal arguments. In particular, we want to illustrate 
how prioritizing a particular subset of attributes when making informal-to-formal comparisons 
influences how students view what it means for a proof to be based on an argument and, by extension, 
affects their ability to formalize. 


Participants 

The eight participants were pursuing undergraduate degrees in mathematics at a large state university in the 
Northeastern United States and had completed a proof-based second course in linear algebra, an 
introduction-to-proof course, and an introductory analysis course. Thus, they were 
familiar with reading and writing proofs in a variety of mathematical contexts. Participants were 
selected so that A, B, and C grades were represented in roughly equal numbers. 


Procedure 

The second author conducted one-on-one clinical interviews with each participant that lasted 
between 90 and 120 minutes. The interviews presented participants with both informal arguments and 
proofs and engaged them in a series of comparison tasks. More specifically, he presented participants 
with triples that consisted of one informal argument and two correct formal proofs, only one of 
which, from our perspective, was based on the argument. 

The informal argument in each triple was presented in the form of a video. Each video lasted 
approximately 30 seconds and involved the first author justifying the result with a combination of 
verbal argumentation, graph generation and gestures. The two correct formal proofs in each triple 
were presented in written form. 

At the beginning of each one-on-one interview participants were told that they would be shown 
triples. They were told that they were to understand and compare the three parts of the triples, but 
that they did not need to validate correctness. In the first round of the interview participants were 
shown each of the three triples one at a time. They watched the video and read the two proofs out 
loud. After participants verified that they felt they understood the result, they were asked to make 
side-by-side comparisons of what they noticed in terms of similarities and differences between the 
three parts of the triple. This was done with each of the three triples. Note that during the first stage 
only the participants’ impressions and what they noticed were elicited. 

The second stage involved revisiting each triple. Participants judged whether each of the proofs in the 
triple was based on its informal argument and reported how confident they were in their assessment and on what 
information they based their conclusion. Students were not informed that, from the researchers’ point 
of view, only one of the proofs in each triple was a formalization of the informal argument. This made 
it possible for participants to conclude that neither or both of the proofs in each triple were based on 
the informal argument. 


Analysis 
A set of minimum criteria, which was agreed upon prior to the interviews, was used to determine 
whether student FBI-judgments were consistent with mathematical norms. We were also interested in 
what students paid attention to when making FBI-judgments. Each interview was transcribed and 
grounded theory (Strauss & Corbin, 1990) was used to categorize interviewee responses in terms of 
what they paid attention to during FBI-judgments. 


Materials 
We briefly mention that Task 1 involved proofs that $\int_{-a}^{a} \sin^3(x)\,dx = 0$ for all real numbers a, 
and Task 3 involved proofs that the derivative of a differentiable even function is odd. For space 
reasons we discuss only Task 2 in detail. To clarify our discussion, each step in the informal 
argument is labeled with an “I2,” each step in the formalization of this argument is labeled with an 
“F2,” and each step in the distractor proof is labeled with a “D2.” 


Figure 1: If a sequence (a_n) has a limit, it is unique. [Figure panels: Informal argument I2; Proof F2 (Formalization of informal argument); Proof D2 (Distractor).] 

Unlike F2, D2 is not intrinsically an argument by contradiction. The contradiction was artificially 
added to proof D2 to make D2 and F2 superficially similar. F2 starts off assuming, toward a 
contradiction, that the limit is not unique (I2-1 and F2-1) and because of this we may choose two 
limits, L1 and L2 (I2-1 and F2-2). Although the assumption that L1 > L2 (F2-2) does not explicitly 
appear in the informal argument, it is implied by the accompanying diagram. Next I2-2 and I2-3 argue 
that the ε-neighborhoods around L1 and L2 can be made small enough to not overlap. In proof F2, this 
“not overlapping” is achieved by choosing ε to be exactly half the distance between L1 and L2 (F2-3), 
and then showing this choice of ε places a_n both above and below the midpoint (F2-4) for all n 
sufficiently large. Finally, both Proof F2 and the informal argument end by arguing that being in two 
places at once leads to a contradiction (F2-5 and I2-4/5). In the proof this is done formally by arguing 
that a term in the sequence, a_n, cannot be both above and below the midpoint. Proof D2 demonstrates 
that any two limits of a sequence can be made arbitrarily close to each other, and thus must be equal, 
but does not rely on the “two places at once” idea. 
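
As a reminder of the underlying mathematics, the two strategies described above can be sketched as follows. This is a standard textbook-style rendering, not a transcription of the proofs used in the study; the names $N_1$, $N_2$ and $N$ and the exact wording are assumptions introduced here.

```latex
% Sketch only; N_1, N_2, N are assumed names, not taken from the study.
% F2-style strategy: contradiction via non-overlapping neighborhoods.
Suppose $a_n \to L_1$ and $a_n \to L_2$ with $L_1 > L_2$, and set
$\varepsilon = \tfrac{L_1 - L_2}{2} > 0$.
There exist $N_1, N_2$ such that
\[
  n \ge N_1 \implies |a_n - L_1| < \varepsilon, \qquad
  n \ge N_2 \implies |a_n - L_2| < \varepsilon .
\]
For $n \ge N = \max(N_1, N_2)$ this gives
\[
  a_n > L_1 - \varepsilon = \tfrac{L_1 + L_2}{2}
  \quad\text{and}\quad
  a_n < L_2 + \varepsilon = \tfrac{L_1 + L_2}{2},
\]
which is impossible.

% D2-style strategy: the two limits are arbitrarily close.
Let $\varepsilon > 0$ be arbitrary, choose $N_1, N_2$ as above for this
$\varepsilon$, and let $n \ge \max(N_1, N_2)$. Then
\[
  |L_1 - L_2| \le |L_1 - a_n| + |a_n - L_2| < 2\varepsilon ,
\]
so $|L_1 - L_2| = 0$ and $L_1 = L_2$.
```

In the first sketch the contradiction carries the argument, since the term $a_n$ is forced to lie on both sides of the midpoint. In the second, the triangle inequality does all the work and no contradiction is needed.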


Results 

Table 1 summarizes how many participants made connections that met our pre-agreed standards 
and hence, from our perspective, made normatively correct FBI-judgments for normatively correct 
reasons. As can be seen from the table, the first task was relatively 
unproblematic for the students in this study. The other two tasks were more difficult. Only two of the 
eight students met our minimum standard on both parts of Task 2 and none of our students met our 
minimum standard on both parts of Task 3. 

These data point to mathematics majors’ difficulties with FBI-comparison tasks. The ability to 
recognize the final product of formalization is crucial. Without this a student cannot recognize when 
the formalization process is complete and thus cannot effectively formalize. 


Table 1: Student FBI-judgments relative to the minimum standard 


Criterion                          Task 1   Task 2   Task 3 
Met criteria for D-I judgment      7/8      5/8      2/8 
Met criteria for F-I judgment      8/8      3/8      1/8 
Met both F-I and D-I criteria      7/8      2/8      0/8 


A model of what students pay attention to when making FBI-judgments 

In our analysis we identified four different aspects of arguments/proofs that students focused on 
when making FBI-comparisons. Two of the foci outlined below are adaptations of Pedemonte’s 
(2007) structural distance and content distance constructs. The reframing of these constructs was 
necessary because the focus of Pedemonte’s research is different from our own. The four foci of 
comparison are described below: 

(1) Structural foci involve noticing global similarities and differences in which inferences follow 
from one another (i.e., structure). We conceptualize this as an adaptation of Pedemonte’s (2007) 
notion of “structural distance” to an FBI-comparison context. In a broad sense structural foci can be 
seen as an attempt to evaluate what kind of relationship exists between the structure of an informal 
argument and the structure of a formal proof. 

(2) Content foci involve noticing which specific elements (i.e., inferences, assumptions, data and 
claims) are present or not present within both an argument and proof. This can be seen as an attempt 
to evaluate the relationship between the content of an informal argument and the content of a formal 
proof. This focus can be seen as an adaptation of Pedemonte’s (2007) notion of “content distance” to 
an FBI-comparison context. 

(3) Methodological foci involve noticing the proof method used (e.g., contradiction, 
contrapositive, induction, construction, etc.) as well as the role this method plays in the proof. 

(4) Holistic foci involve noticing similarities and differences in terms of goals, style, purpose or 
overarching idea. These comparisons focus on proofs and arguments as a whole and overlook 
specific structural, content and methodological details. 

Here we show that prioritizing one of the four foci of comparison in lieu of others has detrimental 
effects on students’ ability to make informal-to-formal comparisons. It is important to note that these 
foci are not necessarily static. Some students may shift foci when moving to a different task. 

Content foci. Content foci involve paying attention to the assumptions 
and inferences within a proof/argument, but largely overlooking the roles and structure of these 
elements within a proof. Here the focus is localized to specific steps in the proof. In other words, the 
details of a proof are examined, but the bigger picture is ignored. We illustrate this focus with an 
excerpt of S8’s work on Task 2. In the excerpt below S8 is asked to compare D2 and F2: 


Int: Any... any other differences you can see? Looking at the proofs again? 

S8: Umm... L1 is defined to just not be equal to L2 in proof D2 and in proof F2, they say L1 is 
greater than L2. 

Int: I, I guess, ... do you consider those significant differences? The ones that you mentioned? 

S8: Yea, yea definitely. Those are significant differences. 


The assumption that L1 > L2 is not explicitly made in I2; however, it is implied by the 
accompanying diagram. Following the above excerpt, S8 proceeded to use the fact that the L1 > L2 
assumption is part of proof F2 and not D2 as a justification for why the distractor proof is based on 
the informal argument while the formalization proof is not. In focusing on a particular piece of 
content that is present in only one of the proofs, he overlooks the bigger picture and, consequently, 
makes an FBI-judgment that is in conflict with the normatively correct interpretation. Hence, we 
contend that prioritizing content in lieu of structure is also insufficient for making normatively correct 
FBI-judgments. 

S8’s assessment is consistent with his content foci. He is looking for specific inferences that are 
present in the proofs in order to compare them to the informal argument. Thus, his expressed de facto 
conception of what it means for a formal proof to be based on an informal argument involves the 
formal proof using similar assumptions and similar inferences to the informal argument. Within this 
conception, a difference in assumptions used is sufficient evidence that a proof is not based on an 
informal argument. 

Methodological foci. It is useful to note that our anticipation of methodological foci influenced 
our task design. Methodological foci were the motivation for making proof D2 artificially a proof by 
contradiction. If we had not artificially made D2 a proof by contradiction, students may have 
concluded that D2 was not based on I2 solely based on the fact that I2 is an argument by contradiction, 
without working to make other connections. Thus our task design intentionally discouraged 
surface-level methodological foci. However, one participant did notice this feature of D2: 


S7: Wait, what? ... There is no point in this [D2] being a proof by contradiction. That is 
completely redundant I could have just crossed this out here, “Assume it does not have a 
unique limit.” You can cross that out. 


Our task design intentionally prevented superficial methodology-based assessments but left the 
tasks open to deeper assessments like the one in the excerpt above. Since only one of the participants 
noticed this feature of D2, we argue that artificially adding or removing particular methodologies 
from proofs has the potential to lead students to make incorrect FBI-judgments. That is, if we used a 
direct proof as a second distractor in place of F2, we anticipate that the majority of students in our 
study would have incorrectly used “only one of these is a proof by contradiction” as a justification 
for why D2 was based on I2. This highlights the limitations of a strictly methodological focus. 

Holistic foci. The final focus we discuss involves examination of holistic traits. The word trait here 
is construed broadly and may include attributes such as elegance, efficiency, style, pedagogical 
purpose or overarching idea. In short, this is intended to capture any treatment of a proof as more 
than the sum of its parts. Proofs have purposes and can be qualitatively compared to both each other 
and to the general genre of proof writing. 

We begin by discussing the work of S1 on Task 1. The excerpt below begins after S1 reads 
proof D1 (he has already read proof F1). 


S1: I feel like D1 was kind of lamer than the other one 

Int: Lamer? 

S1: This one [F1] was a little prettier, it was... I mean over here, we had uh, we were using that it 
was... this [D1] felt very... brute force 


S1 treats the two proofs in Task 1 as aesthetic entities and does not solely focus on the internal 
(line-by-line) workings of the proofs. He expresses the belief that elegant proofs are more desirable 
than brute force proofs and judges D1 as less desirable. Later in his interview, when he was asked to 
compare F1 and D1, he discussed the two proofs relative to the genre of proof as a whole. 


S1: Okay, what do they have in common? Clearly they have the goal in common, but the guy on 
the left, proof D1, proof D1 felt more like uh... I don’t really know if there’s an actual 
distinction in the math world between a “prove something” and a “show something,” but if 
there was, this [D1] definitely feels like, you know, just show that it’s 0. But this [F1] was like 
a really... this felt like it had more behind it here... whereas this [D1] was like, let’s just 
evaluate it and see where that takes us. Okay? Which is fair, you know? It just doesn’t give 
you any insight into why that’s the case. 


S1 expresses the belief that proof D1 does not provide any mathematical insight regarding why 
the result holds. It is simply an exercise in implementing well-established calculation techniques. 
Implicitly, he expresses that he often looks for what insights he can gain from presented proofs; in 
this case he did not find any. 

However, one cannot effectively make FBI-judgments by focusing on holistic attributes of a 
proof in lieu of other attributes. For example, there may be multiple elegant arguments that justify a 
result. Thus, elegance alone is insufficient for making comparisons. 

Multi-focus comparisons. In the previous subsections we argued that prioritizing only one of the 
foci of comparison in lieu of others was insufficient for making normatively correct 
FBI-judgments. In this section we illustrate what comparisons that utilize all four foci look like and how 
they may yield normatively correct FBI-judgments. To clarify, we are arguing that balancing one’s 
attention to these foci greatly increases the likelihood that a student consistently generates correct 
FBI-judgments but does not guarantee normatively correct judgments. 

Below we examine S7’s immediate reaction after reading proof F2 for the first time: 


Int: General impressions. 

S7: F2 is just literally the proof version of I2.... Uh, so the idea behind this is that, okay if we are 
trying to show that this sequence has a unique limit, which we want to show that it can't have 
two limits. So we suppose there is two limits, basic proof by contradiction. So both are proofs 
by contradiction. And the contradiction occurs when epsilon is small enough. Here they show 
it intuitively but it's pretty clear from the picture that what they use was a number that's less 
than half way in between. Here [F2] is that function, the average. So once we have the. It's not 
the average it's close enough so that it doesn't even reach the average. And that way the two 
have no overlap. And then by definition of sequence it should eventually get far enough that 
it's in this region always and once you get far enough it's in this region always but then it's 
therefore always in both these regions once it passes that specific end that we defined. And 
that's the contradiction. Which is what they said here. When we get small enough down we 
can't be in both but it has to be in both. 


It is important to emphasize that S7 realizes that F2 is based on I2 before he is asked to make any 
kind of comparison. The above is simply his initial response. The part of the interview where he was 
specifically asked FBI-questions occurred 30 minutes later. Also, he immediately jumps into the 
comparison when he states that both the proof and the informal argument have the same idea behind 
them (holistic foci). He then shifts to discussing how this idea manifests itself in terms of structure of 
both I2 and F2 (structural foci). He notes that both the proof and informal argument are necessarily 
arguments by contradiction with the contradiction in both cases being that you cannot be in two 
places at once (methodological foci). This is then related to the specifics of the proof, with being both 
above and below the midpoint of L1 and L2 corresponding to being in two places at once in the 
informal argument (content foci). S7 makes all four types of comparisons, does this without any 
specific prompting to make a comparison and relates the four types of comparison foci in his 
discussion. 

We believe the fact that S7 saw the relationship between I2 and F2 before he was asked to 
compare them is particularly important. Mathematics is often discussed metaphorically as a 
language. Here S7 recognized that the informal argument and proof were metaphorically telling the 
same story. This is akin to being shown two paragraphs that tell the same story in two different 
languages, both of which one is fluent in. The fact that the same story has been presented twice, as 
well as the multitude of parallels between its two presentations is salient even without being asked to 
compare the two paragraphs. 

On the other hand, if one is learning the second language and is asked the same question the 
comparison is very different. The comparison becomes an exercise in finding parallels between the 
words and phrases used, as well as the order in which these appear. In this case one is likely to grasp 
onto only a fraction of the similarities and differences between the two paragraphs and make a 
determination based on only this subset. This is analogous to what we observed students doing when 
they prioritized one of the foci over others. 


Discussion 

This paper contributes to the literature on proof and proving both methodologically and 
theoretically. First, from the perspective of theory, this paper introduced a four-part model of the 
aspects of arguments/proofs students focus on when attempting to determine whether a particular 
proof is based on a particular argument. The components of this model are content foci, structural 
foci, methodological foci and holistic foci. We illustrated that comparisons where students prioritized 
one of these categories of comparison in lieu of others were prone to incorrect or incomplete 
conclusions regarding whether a proof was based on an informal argument. Furthermore, failure to 
see the rich connections between informal arguments and proofs points to students having 
underdeveloped conceptions of what it means for a proof to be based on an informal argument. These 
underdeveloped conceptions account for difficulties students have with generating proofs based on 
informal arguments (e.g., Zazkis et al., in press) and in understanding the connections between 
informal arguments and proofs presented in lecture (e.g., Lew et al., 2014). These underdeveloped 
conceptions also account for some of students’ resistance to generating informal arguments during 
proof production. 

Methodologically, the triples method introduced in the study is a valuable research tool for those 
interested in research on the connections between informal arguments and formal proofs. Examining 
how students compare and contrast ready-made informal arguments and formal proofs provides 
valuable insights regarding what they notice when making FBI-judgments. In turn, what students 
notice during FBI-judgments provides a valuable lens into how they conceptualize formalization and 
how they might view formalizing their own informal arguments. This method was able to reveal that 
students’ conceptions of what it means for a proof to be based on an informal argument are not as 
rich as an expert conception, often only encompassing a fraction of the connections that exist 
between informal arguments and formal proofs. 